#Part1 Let the setup and notation be as stated in the assignment.
From the definition of the expected residual life, \(e_i(0) = \int_0^\infty S_i(u|0)du = \int_0^\infty \frac{S_i(u)}{S_i(0)}du = \int_0^\infty S_i(u)du\), since \(S_i(0) = 1\). For ease of notation write \(M_i = M_i(u) = \int_0^u\mu_i(z)dz, \quad i = 1,2\) we have
\[\begin{align*} e_2(0) - e_1(0) &= \int_0^\infty e^{-M_2}du - \int_0^\infty e^{-M_1}du \\ &= \int_0^\infty e^{-M_2} - e^{-M_1}du \\ &= \int_0^\infty \left[ e^{M_1-M_2} - 1\right] e^{-M_1}du \end{align*}\] Now, since \(e^{-M_1} = S_1(u)\) and noting that \[ \frac{d}{du} -S_1(u)e_i(u) = \frac{d}{du} -S_1(u)\int_u^\infty \frac{S_1(z)}{S_1(u)}du = \frac{d}{du} -\int_u^\infty S_1(z)dz = S_1(u),\]
by the fundamental theorem of calculus and switching limits of the integral. Then using integration by parts we get \[\begin{align*} \int_0^\infty \left[ e^{M_1-M_2} - 1\right] e^{-M_1}du &= \left[ (e^{M_1-M_2}-1)(-S_1(u)e_1(u)) \right]_0^\infty - \int_0^\infty (M_1 - M_2)^\prime e^{M_1 - M_2}(-S_1(u)e_1(u))du \\ &= \int_0^\infty (\mu_1(u) - \mu_2(u)) e^{\int_0^u \mu_1(v) - \mu_2(v)dv}S_1(u)e_1(u)du \end{align*}\]
source("Code/structure.R")
Question 1.7: Compute and visualize the period life expectancy at birth for both sexes over the period 1835 − 2020.
We start by creating an HMD object, with Dansih data, and using the corresponding class function “getSmoothMu” to get the smoothed mortality surface.
HMDobj <- HMDdatClass$new(
Otxtfile = 'Data/DNK_Deaths_1x1.txt',
Etxtfile = 'Data/DNK_Exposures_1x1.txt'
)
surf_obj <- HMDobj$getSmoothMu()
From the surf object, we can use the aux-function “surf2lifeexp” to calculate Period life expectancies for all groups, ages = 0, and all times. Then plot using the aux-function life.exp.base
lifeexp_obj <- surf2lifeexp(surf = surf_obj, agelim = 0, type = "period")
plot_dat_lifeexp <- reshape2::melt(lifeexp_obj$lifeexp , varnames = c("Group", "Age", "Year"), value.name = "PeriodLifeExp")
ggplot(data = plot_dat_lifeexp, aes(x = Year, y = PeriodLifeExp, fill = Group)) + geom_bar(stat = "identity", position = position_dodge())
ggplot(data = plot_dat_lifeexp, aes(x = Year, y = PeriodLifeExp, col = Group)) + geom_line()
Question 1.8: Compute and visualize, using e.g. a bar plot, the average annual rate of mortality improvement for both sexes and the age groups and periods stated above.
Mortality improvement rates is defined in Vaupel & Romo (2003) eq. (4) as \[\rho(a,t) = -\acute{\mu}(a,t) = -\frac{\frac{\partial}{\partial t}\mu(a,t)}{\mu(a,t)} = - \frac{d}{dt} log \mu(a,t)\].
While the average pace of mortality improvement is defined in eq. (6): \[\bar{\rho}(t) = \int_0^\infty \rho(a,t) \mu(a,t) S(a,t) du = \int_0^\infty \rho(a,t) f(a,t) da\]
Where \(f(a,t)\) is the distribution of deaths among the period \(t\) cohort.
Since we get data by 1x1 format, we are not able to get analytic expressions for the derivative wrt time or age for that matter.
Hence we will use the observed data for the distribution of deaths, and the estimates from the smoothed \(\mu\)-surface, then using the approximation \[ \rho(a,t) \approx \frac{\mu(a, t+1)-\mu(a,t)}{\mu(a,t)} \] We will use the smoothed \(\mu(a,t)\) from the ‘surf’ object, and get the deaths from the HMD objects.
#get deaths
deaths <- HMDobj$getOEdata()
#reshape to long format, add groups
arr_long <-
reshape2::melt(deaths$O) %>% rename(Deaths = value) %>%
filter(Year %in% 1835:2020) %>%
mutate(Age_grp = cut(Age,
breaks=c(-Inf,0,10*(1:9),Inf),
include.lowest=TRUE,
labels=c("0",paste(10*(0:8)+1, 10*(1:9), sep = "-"),"91+")),
Year_grp = case_when(Year %in% 1835:1900 ~ "1835-1900",
Year %in% 1900:1950 ~ "1900-1950",
Year %in% 1850:2020 ~ "1950-2020")) %>%
arrange(Year, Group, Age)
#calculate normalized death-distribution across time, gender, and age-groups
arr_long_2 <-
arr_long %>% group_by(Year, Group, Age_grp) %>%
mutate(normalised = Deaths/sum(Deaths))
#Let’s plot to get overview of data
ggplot(arr_long %>% filter(Group == "Female") %>% mutate(Age_mod = Age %% 10),
aes(x = Year, y = Deaths)) +
geom_line(aes(group = Age, col = factor(Age_mod)), alpha = 0.4) +
labs(title = "Females: Deaths, facet'd by age groups, coloured by Age",
col = "Age modulus 10") +
scale_color_discrete() +
facet_wrap(vars(Age_grp), scales = "free")
ggplot(arr_long %>% filter(Group == "Male") %>% mutate(Age_mod = Age %% 10),
aes(x = Year, y = Deaths)) +
geom_line(aes(group = Age, col = factor(Age_mod)), alpha = 0.4) +
labs(title = "Males: Deaths, facet'd by age groups, coloured by Age",
col = "Age modulus 10") +
scale_color_discrete() +
facet_wrap(vars(Age_grp), scales = "free")
#calculate change from year t -> t+1 in mortality, using surf object
rel_diff_surf <-
-(surf_obj$mu[,,dimnames(surf_obj$mu)$Year %in% as.character(1836:2020)] -
surf_obj$mu[,,dimnames(surf_obj$mu)$Year %in% as.character(1835:2019)]) /
surf_obj$mu[,,dimnames(surf_obj$mu)$Year %in% as.character(1836:2020)]
#Make into long format
rel_diff_surf_long <- reshape2::melt(rel_diff_surf) %>% arrange(Year, Group, Age) %>% rename(rel_diff_mu = value)
#combine Deaths data set with discrete rho
comb <- left_join(x = arr_long_2, y = rel_diff_surf_long, by = c("Year", "Group", "Age"))
comb_2 <-
comb %>% group_by(Year,Year_grp, Group, Age_grp) %>%
summarise(rho_bar = sum(rel_diff_mu * normalised)) %>%
group_by(Group, Age_grp) %>% mutate(mean = mean(rho_bar, na.rm = T))
## `summarise()` has grouped output by 'Year', 'Year_grp', 'Group'. You can override using the `.groups` argument.
#plot
ggplot(comb_2, aes(x = Year, y = rho_bar, col = Age_grp)) +
geom_line(alpha = 0.6) + geom_hline(mapping = aes(col = Age_grp, yintercept = mean)) +
facet_grid(cols = vars(Group), scales = "free") +
geom_abline(slope = 0, intercept = 0, linetype = "dashed")
## Warning: Removed 11 row(s) containing missing values (geom_path).
ggplot(comb_2, aes(x = Year, y = rho_bar, col = Group)) +
geom_line() + geom_hline(mapping = aes(yintercept = mean, linetype = Group)) +
facet_grid(rows = vars(Age_grp), scales = "free") +
geom_abline(slope = 0, intercept = 0, linetype = "dashed")
## Warning: Removed 2 row(s) containing missing values (geom_path).
It is evident from 1.8 that e(0) has been steadily increasing throughout the last 150 years, with some variation, especially prior til 1925.
#Question 1.10
#function to calculate eq.2